(57) Abstract

The present invention, generally speaking, provides a reconfigurable
computing solution that offers the flexibility of software development and
the performance of dedicated hardware solutions. A reconfigurable processor
chip includes a standard processor, blocks of reconfigurable logic (1101,
1103), and interfaces (319a, 319b, 311) between these elements. The chip
allows application code to be recompiled into a combination of software and
reloadable hardware blocks using corresponding software tools. A mixture of
arithmetic cells and logic cells allows for higher effective utilization of
silicon than a standard FPGA. Configuration planes may be shared between ALU
functions and bus interconnect. More efficient use of configuration stack
memory results, since different sections of converted code require different
proportions of ALU functions and bus interconnect. Many types of interfaces
with the embedded processor are provided, allowing for fast interface between
standard processor code and configurable "hard-wired" functions.
AN INTEGRATED PROCESSOR AND PROGRAMMABLE DATA PATH
CHIP FOR RECONFIGURABLE COMPUTING

1. Field of the Invention 

The present invention relates to reconfigurable computing. 

2. State of the Art

As the cost of increasingly complex integrated circuits continues to fall,
systems companies are increasingly embedding RISC processors into
non-computer systems. As a result, whereas the bulk of development work used
to be in hardware design, now it is in software design. Today, whole
applications, such as modems, digital video decompression, and digital
telephony, can be done in software if a sufficiently high-performance
processor is used. Software development offers greater flexibility and faster
time-to-market, helping to offset the decrease in life cycle of today's
electronic products. Unfortunately, software is much slower than hardware,
and as a result requires very expensive, high-end processors to meet the
computational requirements of some of these applications. Field Programmable
Gate Arrays (FPGAs) are also being increasingly used because they offer
greater flexibility and shorter development cycles than traditional
Application Specific Integrated Circuits (ASICs), while providing most of the
performance advantages of a dedicated hardware solution. For this reason,
companies providing field programmable or embedded processor solutions have
been growing very rapidly.

It has long been known in the software industry that typically most of the
computation time of any application is spent in a small section of code. A
general trend in the industry has been to build software applications,
standardize the interfaces to these computationally intensive sections of
code, and eventually turn them into dedicated hardware. This approach is
being used by many companies to provide chips that do everything from video
graphics [...] digital video decompression. The problem with this approach
[...] chips generally take one or more years to create [...] specific tasks.
As a result, companies have begun providing complex digital signal processing
chips, or DSPs, which can be programmed to perform some of these tasks. DSPs
are more flexible than hardware but are less flexible than standard
processors for purposes of writing software.
The logical extension of the foregoing trends is to create a chip which is a
processor with dedicated hardware that replaces the computationally intensive
sections of the application code. In fact, most complex MPEG chips already
include a dedicated embedded processor, but are nevertheless not very
flexible. Unfortunately, FPGAs, while they provide greater flexibility, are
only 5-10% as dense as gate arrays per usable function. Since there are
usually many different sections of computationally intensive code that must
be executed at different times within any given application, a more efficient
way of using the inherently inefficient FPGA logic is to repeatedly load each
specific hardware logic function as it is needed, and then replace it with
the next function. This technique is referred to as reconfigurable computing,
and is being pursued by university researchers as well as FPGA companies such
as Xilinx and others. U.S. Patent 5,652,875 describes a "selected instruction
set" computer (SISC) implemented in programmable hardware. A related patent
is U.S. Patent 5,603,043. Both of these patents are incorporated herein by
reference.

One aspect of reconfigurable computing involves configuration memory
structures that allow for configuration data to be changed rapidly. A
single-bit portion of a conventional configuration memory structure is shown
in Figure 1. The configuration memory structure may be represented by
interconnected tri-state buffers. A data bit is [...] memory structure by
enabling one or more tri-state buffers. Two separate memory planes are
indicated, Plane 0 and Plane 1. The contents of Plane 1 may be applied to
FPGA logic by enabling buffers 101 and 103. The contents of Plane 1 and
Plane 0 may be exchanged by enabling buffers 101, 105 and 107. Plane 0 and
Plane 1 may also be written from an external source by enabling buffers 109
and 111, respectively. The arrangement of Figure 1 limits the planes to
serial execution and does not allow for sharing of memory planes. In
particular, the FPGA contents cannot be recirculated for storage into the
underlying memory planes.

Another memory arrangement is described in U.S. Patent 5,246,378,
incorporated herein by reference. In accordance with the teachings of this
patent, data defining alternate configurations of reconfigurable logic are
stored in different, logically separate memories. Selection circuitry, such
as multiplexers, selects between outputs of the different memories and causes
the selected outputs to be applied to reconfigurable logic. Time-sliced
operation is described.

Another aspect of reconfigurable computing involves "wildcarding," i.e.,
writing more than one word of configuration memory simultaneously as a result
of a single write access, described in U.S. Patents 5,500,609 and 5,552,772,
both of which are incorporated herein by reference.
Despite the foregoing efforts, there remains a need for a low-cost,
high-performance, flexible reconfigurable computing solution. The present
invention addresses this need.

SUMMARY OF THE INVENTION 
The present invention, generally speaking, provides a reconfigurable
computing solution that offers the flexibility of software development and the
performance of dedicated hardware solutions. A relatively inexpensive
reconfigurable processor chip includes a standard processor, blocks of
reconfigurable logic, and interfaces between these elements. The chip allows
application code to be recompiled into a combination of software and
reloadable hardware blocks using corresponding software tools. Various
features of the reconfigurable processor chip enable it to achieve a
lower-cost, higher-performance solution than pure processors. A mixture of
arithmetic cells and logic cells allows for higher effective utilization of
silicon than a standard FPGA. Configuration planes may be shared between ALU
functions and bus interconnect. More efficient use of configuration stack
memory results, since different sections of converted code require different
proportions of ALU functions and bus interconnect. Many different types of
interfaces with the embedded processor are provided, allowing for fast
interface between standard processor code and the configurable "hard-wired"
functions.

BRIEF DESCRIPTION OF THE DRAWING 

The present invention may be further understood from the following
description in conjunction with the appended drawing. In the drawing:

Figure 1 is a simplified diagram of a conventional configuration memory
structure;

Figure 2 is a simplified block diagram of an Adaptive Compute Engine
(ACE);

Figure 3 is a more detailed floorplan of the Reconfigurable Compute
Engine (RCE) of Figure 2;

Figure 4 is a more detailed block diagram of one possible organization of
the LSM of Figure 2;

Figure 5 is a block diagram illustrating one possible arrangement in which
data is held in place and operators are reconfigured around the data;

Figure 6 is a more detailed block diagram of one possible organization of
the ACM of Figure 2 and Figure 3;

Figure 7 is a more detailed block diagram of another possible organization
of the ACM;

Figure 8 is a block diagram of a further possible organization of the
ACM;

Figure 9 is a diagram of a logic symbol for one possible realization of a
Data Path Unit (DPU);

Figure 10 is an exemplary datapath circuit realized using DPUs of the
type shown in Figure 9;

Figure 11 is a simplified block diagram of the ACM fabric;

Figure 12a is a block diagram of a portion of a multiple plane LSM
corresponding to a block of the ACM;

Figure 12b is a diagram of a group of corresponding memory cells, one
cell from each plane of the memory stack of Figure 12a;

Figure 12c is a diagram of an alternative embodiment of the memory
stack of Figure 12a in which separate "function" and "wire" stacks are
provided;

Figure 12d is a diagram of separate memory stacks provided for control,
datapath and memory configuration, respectively;

Figure 12e is a diagram of a common memory stack provided for control,
datapath and memory configuration;

Figure 13 is a schematic diagram of an alternative embodiment for a
single bit of the memory stack of Figure 12a;

Figure 14 is a diagram representing an addressing portion of the LSM
fabric;

Figure 15a through Figure 15f are diagrams showing patterns of memory
cells written simultaneously;

Figure 16 is a block diagram of the ACE showing coupling of the
processor core with the reconfigurable fabric;

Figure 17a is a diagram of a first exemplary configuration of ACM blocks
according to various functions;

Figure 17b is a diagram of a second exemplary configuration of ACM
blocks according to various functions;

Figure 18a is a diagram of a function map table used during loading of
functions;

Figure 18b is a diagram of block configuration [...] execution of
functions; and

Figure 19 is a pseudocode listing of an exception handling routine.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Referring now to Figure 2, a conceptual block diagram of an adaptive
compute engine (ACE) in accordance with the present invention is shown. The
ACE includes a reconfigurable compute engine (RCE) core 300, together with
various hardwired blocks that support the RCE. In an exemplary embodiment,
these hardwired blocks include the following: Peripheral Component Interface
(PCI) 201; General Purpose Programmable Input/Output (GPI/O) 203a, 203b;
Configurable Memory Interface (CMI) 205; Timer Bank Module (TBM) 207;
Phase Lock Loop (PLL) 209; Baud Rate Generators (BRG) 211; Interrupt
Control Block (ICB) 213; Peripheral Device Interface (PDI) 215; Direct Memory
Access (DMA) circuitry 217; Time Slot Assign/Coherency Tags (TSA) 219; and
System Control Module (SCM) 221.

The RCE core 300 includes a CPU 301 (e.g., a RISC microprocessor), a
local store memory (LSM) 400, and an adaptive compute module (ACM) 600.
Preferably, the RCE core 300 is part of a single ACE integrated circuit. The
particular topology of the integrated circuit is not critical for purposes of
the present invention. However, several important aspects of such an
integrated circuit in accordance with a preferred embodiment of the invention
are illustrated in Figure 3, showing a floor plan of the RCE core 300 of
Figure 2. The RCE core includes a microprocessor portion 301, an interface
portion 310, and an ACM portion 320. The ACM portion 320 is further
subdivided into slices of reconfigurable logic. In an exemplary embodiment,
the slices of reconfigurable logic include control slices 323a, 323b, ...,
and corresponding datapath slices 327a, 327b, .... In the example of
Figure 3, data flows in a horizontal direction and control signals run from
respective control slices to respective datapath slices in the vertical
direction. An LSM array (also "sliceable") 325 may be located amidst the
slices as shown, or may be located beside the slices. The microprocessor 302
communicates with the slices through bus interfaces 319a, 319b, ..., and
communicates with the LSM array through a memory interface 311. Also provided
are a Media Access Controller (MAC) 304 and an external memory interface 306.

Although not separately illustrated in Figure 3, each of the slices of
reconfigurable logic, as well as the local store memory (LSM), includes
configuration memory for that portion. In other words, configuration memory
for the blocks illustrated on the left-hand side of Figure 3 will most likely
be merged together with those blocks in a "fabric," i.e., a highly regular
circuit structure. Many different types of reconfigurable fabrics are
well-known in the art.

A block diagram of one possible implementation of the LSM is shown in
Figure 4. In this embodiment, the LSM is comprised of a tiled set of storage
cells. The "M" cells are nibble oriented storage structures that allow
multi-port access in two dimensions. The "T" cells are optionally used bit
level cells associated with the M cells for either tag bit or error bit
usage. The storage blocks can be further grouped into larger structures to
support larger bit widths.

In conventional ASIC implementations, arithmetic data operators are
constructed sequentially, forming a row or path of operators. The resulting
row of logic operators, multiplexers and registers is called a "datapath."
Data travels down this path undergoing various operations and
transformations.

The ACM/LSM adaptive computation fabric, on the other hand, is structured
by using configuration data bits. The configuration bits are organized in
multiple planes of storage. Swapping configuration planes swaps the logic in
the ACM. Data can be held in place and the operators reconfigured around the
data as shown, for example, in Figure 5. On a first cycle, data passes from a
first register 501 through a "cloud" of reconfigurable logic 503 to a second
register 505. The cloud of logic is then reconfigured, and on a subsequent
cycle, the data passes back from the second register 505 through the cloud of
logic 503 to the first register 501. By operating on the data on multiple
passes through the cloud of logic, which may be configured differently during
each pass, the equivalent of an arbitrarily long datapath may be realized in
ping-pong fashion.
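
By way of illustration only, the ping-pong arrangement of Figure 5 may be
modeled in software as follows. This is a minimal sketch, not the circuit
itself: the function name ping_pong and the use of Python callables to stand
for successive configurations of the logic cloud 503 are assumptions made for
the example.

```python
from typing import Callable, Sequence


def ping_pong(value: int, passes: Sequence[Callable[[int], int]]) -> int:
    """Run `value` through one reconfigurable logic cloud several times.

    Each entry of `passes` stands for one configuration of the cloud.
    Data moves from register 501 to register 505 on one pass and back
    from 505 to 501 on the next.
    """
    registers = {501: value, 505: 0}
    src, dst = 501, 505
    for configure in passes:
        # One cycle: data leaves `src`, crosses the cloud, lands in `dst`.
        registers[dst] = configure(registers[src])
        # The cloud is reconfigured and the direction reverses.
        src, dst = dst, src
    # After the final swap, `src` names the register holding the result.
    return registers[src]


# Three configurations applied in turn: add 3, double, mask to 8 bits.
result = ping_pong(5, [lambda x: x + 3, lambda x: x * 2, lambda x: x & 0xFF])
```

The same two registers and a single cloud thus stand in for a three-operator
datapath; a longer list of configurations would model a longer datapath
without any additional logic.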



Multiplexing different operators onto the same logic fabric saves valuable
silicon area, providing a "virtual density" improvement. As described
hereinafter, the use of multiple device configuration planes allows for
virtually instantaneous reconfiguration. Furthermore, memory bandwidth
requirements for loading a configuration plane are dramatically reduced using
compression techniques.

Unlike existing FPGAs, the present ACM is a heterogeneous configurable
fabric of control, datapath and memory partitions, including a fine-grained
control structure that is used to control a coarse-grained datapath
structure. The reconfigurable compute fabric may consist of a number of tiled
cells that extend in the X and Y coordinate system, including DPUs (Data Path
Units) and the associated ICM (Interconnection Module) components. The DPUs
provide the data path functionality for the behavioral mapping and the ICMs
define the bus-oriented interconnection between the DPUs. Preferably, the
control portion and the LSM memory fabric are defined in a similar fashion.

Referring more particularly to Figure 6, a more detailed block diagram is
shown of the ACM of Figure 2 and Figure 3. Corresponding reference numerals
are used to indicate corresponding elements in Figure 3 and Figure 6. A
fine-grained control structure fabric 610a, 610b consists of tiled Boolean
Logic Units (BLUs) 611a, 611b. The tiled BLU array interfaces to a global
signal control bus and CPU register control interface 609. The global signal
bus 609 allows clock gating of registered variables or bidirectional steerage
of data values. The BLUs are bit level oriented cells for orthogonal control
of the ACM's datapath DPU partition slices 620a, 620b. This control can be in
the form of cones of combinatorial logic or small state machines.

The datapath partition is a sliceable structure comprised of multiple bit,
coarse-grained configurable datapath cells, DPUs (Datapath Program Units)
621a, ..., 621b, ..., that efficiently support typical arithmetic and bit
multiplexing operators. The DPUs operate on data in 4 bit nibbles. This
allows the datapath fabric to be implemented in a denser, coarse-grained
silicon implementation, compared to current FPGA technology, which uses
inefficient, bit-oriented logic elements (fine-grained). The coarser-grained
aggregation of data also allows construction of high performance, long bit
width arithmetic function modules such as multipliers and adders. Fewer bits
of control for logic configuration are required, compared to conventional
bit-oriented FPGA structures. Interconnection Modules 630a, 630b are used to
communicate with the LSM storage mechanism for high bandwidth data traffic for
queuing or loop processing.

In Figure 6, configuration memory planes underlying each of the various
reconfigurable structures are explicitly shown. This representation is a
logical representation of the ACM and not necessarily a physical
representation. Physically, the structures illustrated in three dimensions in
Figure 6 may be mapped to two dimensions.

Referring to Figure 7, in an alternative implementation, the LSM is
realized in distributed fashion, e.g., as 4 x 4 blocks of memory interspersed
with the DPUs. Dispersing the LSM relieves a possible memory bottleneck.
Instead of accessing the LSM through the routing/memory interface, external
system memory can be accessed through the routing/memory interface. In
Figure 7, DPUs and LSM blocks alternate in the vertical direction. That is,
datapath slices alternate with LSM slices. Referring to Figure 8, DPUs and
LSM blocks instead alternate in the horizontal (dataflow) direction. This
layout models typical algorithm flow of operator, storage, operator, storage
in a pipelined implementation.

Many different types of DPUs are possible. A logic symbol for one
possible DPU is shown in Figure 9. The DPU operates as set forth in Table 1.
Table 1

OP CODE   OPERATION   COMMENT

0000      NOP         Passes A and B through to the higher and lower
                      output bits, respectively
0001      SUB         Cin must be 1
0010      AND
0011      MUL         May increment as well if Cin = 1
0100      OR
0101      INC         Cin must be 1; increments A and B together as a
                      four-bit number
0110      XOR
0111      ADD
1000      SWAP        Passes B and A through to the higher and lower
                      output bits, respectively
1001      SHIFT1      Works on all four input bits, not just two
1010      ROT1        Works on all four input bits, not just two
1011      [...]       Works on all four input bits, not just two
1100      ROT2        Works on all four input bits, not just two
1101      SHIFT2      Works on all four input bits, not just two
1110      ROT3
1111      SHIFT3

An exemplary datapath circuit realized using such DPUs is shown in
Figure 10.
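
For purposes of illustration, the pass-through and increment rows of Table 1
may be sketched in software as follows. The sketch assumes, from the comments
in Table 1, that A and B are each two bits wide and that the DPU output is
four bits wide; the function name dpu, the wrap-around of the increment at
four bits, and the decision to leave every other opcode unmodeled are
assumptions of the example rather than features taken from Figure 9.

```python
def dpu(opcode: int, a: int, b: int, cin: int = 0) -> int:
    """Return the 4-bit output for a few Table 1 opcodes (assumed widths)."""
    a &= 0b11  # assumption: A is a two-bit input
    b &= 0b11  # assumption: B is a two-bit input
    if opcode == 0b0000:
        # NOP: A goes to the higher output bits, B to the lower output bits.
        return (a << 2) | b
    if opcode == 0b1000:
        # Opcode 1000: B goes to the higher output bits, A to the lower.
        return (b << 2) | a
    if opcode == 0b0101:
        # INC: A and B are incremented together as one four-bit number.
        if cin != 1:
            raise ValueError("INC requires Cin = 1")
        return (((a << 2) | b) + 1) & 0xF  # assumption: wraps at four bits
    raise NotImplementedError(f"opcode {opcode:04b} is not modeled here")


nop_out = dpu(0b0000, a=0b10, b=0b01)
swapped_out = dpu(0b1000, a=0b10, b=0b01)
inc_out = dpu(0b0101, a=0b01, b=0b11, cin=1)
```

The arithmetic, logical, shift and rotate rows are deliberately omitted
because Table 1 does not state how their operands map onto the output bits.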
An important feature of the RCE core is the ability to dynamically
reconfigure the ACM on the fly in a very short amount of time, typically less
than the amount of time required for a memory read operation in a
conventional computer. The structure of the ACM/LSM fabric is specially
adapted to enable this type of operation. More particularly, the ACM/LSM
includes multiple logical memory planes, e.g., four memory planes, eight
memory planes, etc. Any number of planes may be provided for (including
numbers not powers of two).

Referring to Figure 11, a conceptual block diagram is shown of one block
of the ACM/LSM fabric. The fabric includes control reconfigurable logic
(C-RL) 1101, datapath reconfigurable logic (D-RL) 1103, and reconfigurable
memory 1105. Associated with each of these structures are multiple planes of
configuration storage, i.e., control configuration storage 1107, datapath
configuration storage 1109 and memory configuration storage 1111.

A particular embodiment of a portion of a multiple plane LSM corresponding
to a block of the ACM/LSM fabric is shown in Figure 12a. The multiple memory
planes form in effect a memory plane stack 1200. In the case of a DP-RL
block, the top two planes 1206, 1205 of the memory plane stack are
configuration planes. Configuration data stored in these planes is applied to
the reconfigurable logic. In the illustrated embodiment, "function"
configuration data and "wire" configuration data are stored in different
planes. The bottom memory plane 1200a provides external access to the memory
stack. Intermediate planes function, for example, as a configuration stack,
storing configurations expected to be used but not presently active. In an
exemplary embodiment, memory plane 0 is single port, for single-channel read
and write between system memory and configuration storage. The remaining
memory planes are dual port, having one read port and one write port. Dual
port supports simultaneous loading and recirculation of configuration data
with the local "stack." If no data compression is used, then simultaneous
real-time monitoring is possible, e.g., by writing out a "snapshot" of one or
more planes of the stack.
A group of corresponding memory cells, one cell from each plane of the
memory stack, is shown in Figure 12b. The ports of all of the cells are
interconnected so as to allow an operation in which the contents of a cell
within any plane may be read and then written to the corresponding cell of
any other plane. For example, by activating the appropriate control signal,
the contents of plane 4 may be read and written into plane 6. Such an
operation may be accomplished, preferably, in a single clock cycle, or at
most a few clock cycles. As described more fully hereinafter, configuration
data is loaded from external main memory into plane 0 of the memory stack in
anticipation of its being transferred into a configuration plane.
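
The plane-to-plane transfer just described may be illustrated by the
following simplified software model. It is a sketch under stated assumptions:
the class name PlaneStack, its method names, and the choice of eight planes
of four cells each are hypothetical, and the model ignores port counts and
clock timing.

```python
from typing import List


class PlaneStack:
    """Toy model of a memory plane stack for one block of the fabric."""

    def __init__(self, num_planes: int, cells: int) -> None:
        self.cells = cells
        self.planes: List[List[int]] = [[0] * cells for _ in range(num_planes)]

    def load_external(self, data: List[int]) -> None:
        """Write configuration data from main memory into plane 0."""
        if len(data) != self.cells:
            raise ValueError("data must supply one value per cell")
        self.planes[0] = list(data)

    def copy(self, src: int, dst: int) -> None:
        """Read each cell of plane `src`; write it to the same cell of `dst`."""
        self.planes[dst] = list(self.planes[src])


stack = PlaneStack(num_planes=8, cells=4)
stack.load_external([1, 0, 1, 1])  # staged in plane 0 ahead of use
stack.copy(0, 4)                   # parked in an intermediate plane
stack.copy(4, 6)                   # plane 4 read and written into plane 6
```

In the model, as in the text, the source plane is left unchanged by a
transfer, so a configuration may be parked in an intermediate plane and
reused later without being reloaded from main memory.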

Alternatively, separate "function" and "wire" stacks may be provided, as
shown in Figure 12c. Using this arrangement, function and wire configurations
may be changed simultaneously. Similarly, configuration stacks for
configuration of control, datapath and memory may be separate (Figure 12d) or
combined (Figure 12e).

A schematic diagram of an alternative embodiment of a cell stack is
shown in Figure 13, showing a cross section of several configuration planes
1301-1304 and the lockable fabric-definition cell 1305 that produces a
Fabric_Define_Data bit for a single bit location. These bits are aggregated in
order to form sufficient bit numbers for functional cell type definition. For
instance, a four bit grouping might designate between four and sixteen
different cell type definitions. The other latch sites below the storage cell
are for additional configuration plane data available for swapping as needed
by functional scheduling requirements. These storage locations can be written
to and read from a common configuration data bus structure. The
Config_Read_Data and Config_Load_Data buses 1307 and 1309, although shown as
being separate, can be combined as a single bi-directional bus for wiring
efficiency. This bus structure allows configuration data to be written as
needed. The Swap_Read_Plane buffer 1311 allows existing configuration plane
data contents to be swapped among differing configuration planes on a
selectable basis. For instance, the current operation plane of data can be
loaded from configuration plane 1 to configuration plane 2 by the use of the
Swap_Read_Plane buffer 1311. The structure shown in Figure 13 is similar to a
conventional SRAM memory structure, which allows a dense circuitry
implementation using standard memory compiler technology. This structure
could also be implemented as a conventional dual port RAM structure (not
shown), which would allow for concurrent operation of the write and read data
operations. Unlike Figure 12b, the example of Figure 13 assumes separate
configuration stacks for each configuration plane as described hereinafter.
That is, the bit stack produces only a single Fabric_Define_Data bit instead
of multiple fabric definition data bits as in Figure 12b.

If the Data_Recirc_Read line 1313 is also connected to data storage
locations that are used for normal circuit register operation, then real time
monitoring of device operations can be utilized by the operating system for
applications such as RMON in the internetworking application area or for real
time debug capability. The RMON application basically uses counter operation
status from registers in order to determine system data operation flow
characteristics.

Figure 14 is a system level perspective of an access portion of the
configurable ACM/LSM, which provides the functionality necessary to configure
an operable plane of logic. (The logic shown is at a symbolic level of
representation, while the actual logic to perform the cell selection and
address decode can vary according to techniques commonly used for address and
data for SRAM structures.) In this embodiment, a set of X and Y decode latches
with associated buffers 1401, 1403 drive decode enable signals into the tiled
logic plane consisting of a replicated structure composed of NAND gates 1405,
1407 and a configuration plane logic cell 1409 of the type described in
relation to Figure 13. The combination X and Y decode structure enables
arbitrary collections of cell sites to be addressed by the corresponding X and
Y decode enables, which are shown NANDed together to provide row/column [...]
capability. The address bus 1411 selects a particular configuration plane and
is globally broadcast into the slice of the larger array to be programmed for
either read or write of configuration data. The configuration data bus is not
shown for simplicity. In the illustrated embodiment, the global address bus
1411 is decoded at each cell by the use of local cell decode logic (NAND gates
1405). Alternatively, the global address bus may be implemented in terms of
straight-line, single-bit word lines.

The structure of Figure 14 allows programming compression to be
accomplished by running a compression program on the configuration map to
find the commonly repeating structures so that they may be written
simultaneously. This measure will significantly reduce both the size of the
data file and the corresponding load time, since most of the like datapath
elements will be repeating both horizontally and vertically, in configuration
patterns such as those shown in Figure 15a through Figure 15f. The cells that
correspond to a "maximal function" having highest utilization are globally
selected by the X/Y decode latches for maximal coverage, and a configuration
plane address is broadcast, designating a particular configuration plane
layer. A global data bus (not shown) then loads a data value that corresponds
to a given logic operator or wiring configuration. The next most commonly
used function may then be loaded in a like process. The next configuration
mapping of commonly used cell types can in fact over-write cell locations
from the previous load cell type operation. That is, successive cell type
load operations can supersede previous cell content loading. This method of
loading allows the maximal functions to be stitched into the configuration
fabric as needed in arbitrary cell locations. The ordering of cell types by
usage for a given configuration plane allows the compression of information
content such that individual addressing schemes for each cell location are
not necessary.
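
The loading order described above may be sketched in software as follows.
This is an illustrative model only: the names plan_writes and apply_writes,
the representation of a write as a set of enabled rows, a set of enabled
columns and a cell value, and the simple row-grouping heuristic are all
assumptions of the example and are not the compression program itself.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Sequence, Set, Tuple

# One broadcast write: (enabled rows, enabled columns, value to load).
Write = Tuple[FrozenSet[int], FrozenSet[int], int]


def plan_writes(target: Sequence[Sequence[int]]) -> List[Write]:
    """Order broadcast writes from the most to the least common cell value."""
    rows, cols = len(target), len(target[0])
    counts = Counter(v for row in target for v in row)
    ordered = [v for v, _ in counts.most_common()]
    # The most common value is written to every cell in a single access.
    writes: List[Write] = [
        (frozenset(range(rows)), frozenset(range(cols)), ordered[0])
    ]
    for value in ordered[1:]:
        # Rows needing `value` in the same columns share one write.
        by_cols: Dict[FrozenSet[int], Set[int]] = {}
        for r in range(rows):
            cs = frozenset(c for c in range(cols) if target[r][c] == value)
            if cs:
                by_cols.setdefault(cs, set()).add(r)
        for cs, rs in by_cols.items():
            writes.append((frozenset(rs), cs, value))
    return writes


def apply_writes(writes: Sequence[Write], rows: int, cols: int) -> List[List[int]]:
    """Replay the writes; a cell is loaded when its row and column are both enabled."""
    grid = [[0] * cols for _ in range(rows)]
    for rs, cs, value in writes:
        for r in rs:
            for c in cs:
                grid[r][c] = value  # later loads supersede earlier ones
    return grid


target = [
    [1, 1, 2, 1],
    [1, 1, 2, 1],
    [1, 3, 2, 1],
    [1, 1, 2, 1],
]
writes = plan_writes(target)
loaded = apply_writes(writes, rows=4, cols=4)
```

For this sixteen-cell map the plan contains three writes: one that loads cell
type 1 everywhere, one that loads type 2 down a column, and one that loads
type 3 into a single cell, with each later write superseding the earlier
contents.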

The foregoing discussion has focused on the reconfigurable ACM/LSM
fabric. The remainder of the discussion will focus on the interface between
the fabric and the microprocessor. The microprocessor follows a standard RISC
architecture and has multiple coprocessor and special instructions that are
used to interface with the reconfigurable logic. If the instructions are not
used, then the configuration programming automatically adds default tie-off
conditions (for cells that are not used or to safely configure routing to
prevent interference of operations). In an exemplary embodiment, the
microprocessor interfaces with the reconfigurable logic through some or all
of the following mechanisms:

1) Via the system bus (memory mapped). 

2) Via a coprocessor bus. 

3) Via a special instruction interface (internal execution unit storage bus).

4) Via special registers. 

In case (1), the reconfigurable memory or logic planes can be accessed by
writing to or reading from a defined address space via the system bus. This
operation appears as if it were a regular memory access. In case (2), there
exist within the RISC architecture special instructions for loading
coprocessor registers and turning control over to a coprocessor. The
coprocessor (in this case the ACM/LSM) signals when it is complete, and the
processor can load the contents of the coprocessor interface registers back
into the processor.

In case (3), there exists an interface off of the internal processor bus. One
possible interface is shown in Figure 16, illustrating coupling of the
processor core with the reconfigurable fabric. The processor core is realized
as a four-stage pipeline including stages 1610, 1620, 1630 and 1640 (the
execution stage). Within the execution stage 1640, an ALU and the ACM are
tightly coupled. In particular, both the ALU and the ACM receive operational
data from a register file in the stage 1630. A mapping is performed between a
smaller number of registers (e.g., 32) within the register file to a
potentially much larger number of registers within the ACM.

Special register-register or register-memory instructions cause two or
more words to be loaded into a register at the boundary of the bus. A mechanism
is provided for stalling loading of results computed in the ACM and LSM fabric
into the CPU register set, if necessary, to preserve sequential program
execution integrity. The stall mechanism may take the form of a flag, a
dedicated signal line, etc. The results of the operation are placed within a set
of special instruction registers. Any request to read the contents of a special
instruction register before the stall for those registers has been cleared
stalls that read instruction. Finally, in case (4), the coprocessor or special
instruction registers may be read or written by either the processor or the ACM.
A clock offset from the processor clock may be provided to guarantee alternating
read-write cycle operation if the ACM can keep up with the processor.
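
By way of illustration only, the flag form of the stall mechanism may be
sketched as follows. The class SpecialRegisterFile, the function
read_with_stall and the one-completion-per-cycle timing are assumptions made
for the example, not features recited in the description.

```python
class SpecialRegisterFile:
    """Special instruction registers, each guarded by a stall flag (assumed form)."""

    def __init__(self, count):
        self.values = [0] * count
        self.stalled = [False] * count

    def issue(self, reg):
        # An operation whose result is destined for `reg` has been started.
        self.stalled[reg] = True

    def complete(self, reg, value):
        # The fabric delivers the result, which clears the stall for `reg`.
        self.values[reg] = value
        self.stalled[reg] = False

    def try_read(self, reg):
        # Returns (ok, value); ok is False while the read must stall.
        if self.stalled[reg]:
            return False, None
        return True, self.values[reg]


def read_with_stall(regs, reg, fabric_events):
    """Retry a read once per cycle; one fabric completion arrives per stalled cycle.

    fabric_events lists (reg, value) completions in arrival order.
    Returns (value, stall_cycles).
    """
    pending = list(fabric_events)
    cycles = 0
    while True:
        ok, value = regs.try_read(reg)
        if ok:
            return value, cycles
        if not pending:
            raise RuntimeError("read would stall forever: no result pending")
        regs.complete(*pending.pop(0))
        cycles += 1
```

A read of a register whose flag is clear completes at once; a read of a flagged
register is retried until the fabric has delivered the result, which is what
keeps the program's sequential semantics intact.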

In an exemplary embodiment, three specific types of special instructions 
are provided: 

1) Load instructions, which load a plane within a block.

2) Invoke instructions, which transfer the contents of a plane to a
configuration plane (wire or function).

3) Execute instructions, which can be in any of the four cases above.

Each of these types of instructions will be considered in turn in greater
detail.

Load instructions are used to load a plane within a block. Preferably, the
ability to swap planes is available both to the microprocessor and to the
reconfigurable logic blocks. More than one function can be mapped onto a plane
within a block, or a single function can take up more than one block or plane.
Possible configurations are shown in Figure 17a and Figure 17b.

Note that when a function is contained on two or more planes it is actually
multiply interlinked. This is possible because the reconfigurable logic can
invoke a function, and the register contents of any plane can be preserved when
the routing and function configurations are changed. Preferably, a mechanism is
also provided for reading and writing the register contents from the
reconfigurable logic as well. This allows the swapping of the entire operation
out and back, thus allowing one function to be overlaid by another without
losing the first function's contents.

Development software is provided to optimally place Load and Invoke
instructions within the instruction stream so as to minimize stalls within the
process. Such software is described in U.S. Patent Application Serial No.
08/884,377, incorporated herein by reference. Still, hardware must automatically
trap invalid conditions in order to allow the processor to load and invoke the
proper plane, and prohibit the processor from invoking a plane on top of a
locked and executing process, unless the process is swappable or is expected to
abort automatically if another execution is issued. These hardware functions
may be performed using the function map table of Figure 18a and the block
configuration table of Figure 18b.

Referring to Figure 18a, the function map table provides the module
address for the function. The module address is the address in main memory of
the blocks, in compressed format, to be loaded. The function table also contains
Plane Utilization Bits (PUBs), along with lock and swap bits for the function.
The plane utilization bits are assigned based on execution ordering of functions
that are mapped to modules in hardware for sequential program execution.
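
By way of illustration only, one entry of the function map table may be
modelled as follows. Reading the Plane Utilization Bits as a bitmask with one
bit per plane is an assumption of the example, as are the field names, the
helper planes_overlap and the sample addresses.

```python
from dataclasses import dataclass


@dataclass
class FunctionMapEntry:
    module_address: int      # main-memory address of the compressed blocks
    plane_utilization: int   # PUBs, read here as a bitmask: bit i set = plane i used
    lock: bool = False
    swap: bool = False

    def planes(self):
        """Indices of the planes this function occupies."""
        mask = self.plane_utilization
        return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def planes_overlap(a, b):
    """True when two functions would occupy at least one common plane."""
    return bool(a.plane_utilization & b.plane_utilization)


# Sample table; the addresses and masks are made up for the example.
function_map = {
    "A": FunctionMapEntry(module_address=0x1000, plane_utilization=0b0011, lock=True),
    "B": FunctionMapEntry(module_address=0x2000, plane_utilization=0b0100, swap=True),
    "C": FunctionMapEntry(module_address=0x3000, plane_utilization=0b0010),
}
```

Under this reading, functions A and B can be resident together because their
masks are disjoint, whereas loading C would displace part of A.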

The Load function issues a soft interrupt which is handled by an on-chip
"mini operating system" in a manner similar to a supervisor call. The old
functions in the table are cleared for the target planes, and the planes are
loaded via move instructions which use DMA transfers, in a manner similar to an
interrupt-driven I/O operation. While the DMA transfers proceed, the
processor returns to execute its normal instruction stream. An interrupt
signalling completion of the transfer of the planes will re-enter the "driver"
code, which will update the function map table. If the function is already
loaded, then the Load instruction returns without loading. If the address does
not exist, then the operation aborts with an error exception.
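
By way of illustration only, the handling of a Load may be sketched as
follows. The class LoadDriver, the exception LoadError, the returned status
strings and the "transfer in progress" case are assumptions of the example;
the DMA transfer itself is not modelled, only its start and its completion
interrupt.

```python
class LoadError(Exception):
    """Raised when the requested module has no known address."""


class LoadDriver:
    def __init__(self, modules):
        self.modules = modules      # function name -> (module address, target planes)
        self.function_map = {}      # plane index -> name of the function loaded there
        self.in_flight = {}         # function name -> planes with a DMA in progress

    def load(self, name):
        """Handle the soft interrupt raised by a Load instruction."""
        if name not in self.modules:
            raise LoadError(f"no module address for {name!r}")
        _address, planes = self.modules[name]
        if all(self.function_map.get(p) == name for p in planes):
            return "already loaded"
        if name in self.in_flight:
            return "transfer in progress"    # assumed behaviour, not in the text
        for p in planes:
            self.function_map.pop(p, None)   # clear the old functions
        self.in_flight[name] = planes        # DMA starts; the processor resumes
        return "transfer started"

    def dma_complete(self, name):
        """Handle the completion interrupt: update the function map table."""
        for p in self.in_flight.pop(name):
            self.function_map[p] = name
```

The table is updated only in dma_complete, so a function is never reported as
resident while its planes are still being transferred.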

The Invoke command copies the contents of one plane to another.

Referring to Figure 18b, block configuration words are maintained for
each block in the ACM, including, for each block, a Routing Plane word and a
Function Plane word. Run, Lock and Swap bits indicate the status of the current
effective configurations within each block. A "From Plane" field may be used to
swap a function back to a previous plane.
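
By way of illustration only, the Invoke operation and the "From Plane" field
may be sketched as follows. The class Block and its attribute names are
hypothetical, and a plane is represented simply as a list of configuration
words.

```python
class Block:
    """One reconfigurable block with several stored planes (illustrative)."""

    def __init__(self, planes):
        self.planes = [list(p) for p in planes]  # stored configuration planes
        self.effective = None                    # contents of the configuration plane
        self.current = None                      # index of the plane now in effect
        self.from_plane = None                   # "From Plane" field

    def invoke(self, index):
        """Copy a stored plane into the configuration plane."""
        self.effective = list(self.planes[index])
        self.from_plane, self.current = self.current, index

    def swap_back(self):
        """Return to the plane recorded in the "From Plane" field."""
        if self.from_plane is None:
            raise RuntimeError("no previous plane recorded")
        self.invoke(self.from_plane)
```

Recording the previous plane index at each Invoke is what allows a displaced
function to be brought back without consulting the function map table again.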

The information in the block configuration words is used to determine
how to handle the Execute instruction. The Execute instruction is decoded by the
control logic interface to the reconfigurable logic. Either the function is
resident, in which case it is executed with Run set to 1 on the appropriate
planes and blocks, or it is not, in which case a soft interrupt is executed
which branches the processor into an exception handling routine with the return
address at the Execute command, allowing the instruction to be reissued when the
function is loaded.
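
By way of illustration only, the decoding of an Execute instruction may be
sketched as follows. The names BlockConfig, SoftInterrupt, execute and
execute_with_trap are hypothetical, and the handler passed in merely stands in
for the exception handling routine.

```python
from dataclasses import dataclass
from typing import Optional


class SoftInterrupt(Exception):
    def __init__(self, return_address):
        super().__init__(f"function not resident; return to {return_address:#x}")
        self.return_address = return_address


@dataclass
class BlockConfig:
    function: Optional[str] = None   # function whose configuration is in effect
    run: bool = False
    lock: bool = False
    swap: bool = False


def execute(blocks, function, pc):
    """Decode an Execute at address pc; returns the number of blocks started."""
    targets = [b for b in blocks if b.function == function]
    if not targets:
        # Not resident: trap, with the return address at the Execute itself.
        raise SoftInterrupt(return_address=pc)
    for b in targets:
        b.run = True
    return len(targets)


def execute_with_trap(blocks, function, pc, handler):
    """Run Execute; on a trap, call the handler and reissue the instruction."""
    try:
        return execute(blocks, function, pc)
    except SoftInterrupt as trap:
        handler(blocks, function)
        return execute(blocks, function, trap.return_address)
```

Because the return address points at the Execute itself rather than at the
following instruction, the same instruction is simply issued again once the
handler has made the function resident.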

The exception handling routine issues one or more Invoke commands with
the appropriate parameters, after determining if the current functions are
locked or swappable as specified in the appropriate block configuration bits. If
the blocks are currently executing another function, Run is set to 1. If the
Swap bit is 1, then the function is swappable. If the Lock bit is set to 1, then
the current plane is locked. One suitable exception handling routine is
described by the pseudocode listing of Figure 19.

The effect of the exception routine is to re-execute the routine after it has
been loaded or swapped in, or to skip the instruction. Note that if the
currently executing function is not locked or swappable, it may be aborted.
Upon completion of the Execute instruction, when the results are returned
to the processor by a mechanism such as those described above, the run bits are
cleared.
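
By way of illustration only, the decision made by the exception handling
routine for each block may be sketched as follows. This is not the listing of
Figure 19; it is one reading of the Run, Lock and Swap rules stated in the
text, and the function names and action strings are hypothetical.

```python
def resolve(run, lock, swap):
    """Choose an action for one block from its Run, Lock and Swap bits."""
    if not run:
        return "invoke"               # block is idle: invoke directly
    if swap:
        return "swap out, then invoke"
    if lock:
        return "skip"                 # locked and executing: must not be overlaid
    return "abort, then invoke"       # neither locked nor swappable


def resolve_all(block_bits):
    """Combine per-block decisions; a single locked block forces a skip."""
    actions = [resolve(*bits) for bits in block_bits]
    return "skip" if "skip" in actions else actions
```

The Swap bit is tested before the Lock bit because the text permits invoking
over a locked, executing process when that process is swappable.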

The result of the foregoing approach is to allow the software to minimize
the run time by early loading and invoking of the functions, while always
executing the functions, if at all possible, whether they have been previously
loaded or not.

It will be appreciated by those of ordinary skill in the art that the
invention can be embodied in other specific forms without departing from the
spirit or essential character thereof. The presently disclosed embodiments are
therefore considered in all respects to be illustrative and not restrictive. The
scope of the invention is indicated by the appended claims rather than the
foregoing description, and all changes which come within the meaning and range
of equivalents thereof are intended to be embraced therein.

What is claimed is:

1. An integrated circuit, comprising:
fine-grain reconfigurable control logic;
coarse-grain reconfigurable datapath logic; and
memory means coupled to the reconfigurable control logic and the
reconfigurable datapath logic for defining multiple configurations of the
reconfigurable control logic and the reconfigurable datapath logic.

2. The apparatus of Claim 1, further comprising reconfigurable
memory, wherein said memory means is coupled to the reconfigurable memory
for defining multiple configurations of the reconfigurable memory.

3. The apparatus of Claim 2, further comprising a microprocessor
coupled to at least one of said reconfigurable control logic, said reconfigurable
datapath logic, and said reconfigurable memory.

4. The apparatus of Claim 3, wherein the microprocessor is coupled 
to multiple ones of said reconfigurable control logic, said reconfigurable datapath 
logic, and said reconfigurable memory. 

5. The apparatus of any of the preceding claims, further comprising
interconnection between the reconfigurable control logic and the reconfigurable
datapath logic.

6. The apparatus of any of the preceding claims, wherein said
memory means comprises multiple logical memory planes.

7. The apparatus of Claim 6, further comprising means for
performing hardware-controlled transfer of data between logical memory planes.

8. The apparatus of Claim 7, wherein said transfer of data is direct
plane-to-plane transfer.

9. The apparatus of Claim 8, wherein the direct plane-to-plane
transfer is completed within a single cycle of the microprocessor.

10. The apparatus of Claim 1, wherein said memory means comprises
means for simultaneously addressing multiple memory locations located in
different memory rows and different memory columns to write identical data into
the multiple memory locations, whereby an amount of data needed to completely
configure at least one of said reconfigurable control logic and said
reconfigurable datapath logic is substantially reduced.

11. The apparatus of Claim 10, wherein at least one of the
reconfigurable control logic and the reconfigurable datapath logic comprises
multiple cells, each cell requiring a predetermined number of bits of
configuration information to configure the cell, wherein at least a portion of
said memory means is organized into data words having a word length equal to the
predetermined number of bits.

12. The apparatus of Claim 3, further comprising a bus coupled to the
microprocessor and, coupled to the bus, at least one of a bus controller for
controlling an external bus and a memory controller for controlling an external
memory.

13. A reconfigurable computing method using an adaptive compute
engine including a microprocessor, a memory, and an array of reconfigurable
logic elements, the method comprising the steps of:
executing instructions on a microprocessor;
in response to one or more instructions, loading multiple sets of
configuration data into the memory.

14. The method of Claim 13, wherein the multiple sets of
configuration data comprises at least one set of effective configuration data
applied to the reconfigurable logic elements and at least one set of other
configuration data not applied to the reconfigurable logic elements, the method
comprising the further step of, in response to a predetermined instruction,
physically swapping the effective configuration data and the other configuration
data.



15. The method of Claim 13, comprising the further step of, in
response to one or more instructions, passing data and control information
between the microprocessor and the array of reconfigurable logic elements.

16. The apparatus of Claim 15, comprising the further steps of, in
response to one or more instructions:
performing at least one of loading a set of configuration data from
external memory to become the effective configuration data and physically
swapping a set of configuration data to cause it to become the effective
configuration data; and
causing the array of reconfigurable logic elements to perform
processing in accordance with the effective configuration data.



WO 99/00739                PCT/US98/13565

[Sheet 1/16. FIG. 1 (PRIOR ART): labels PLANE, PLANE 1, FPGA BIT; reference
numerals 101, 103, 105.]

[FIG. 2: blocks labeled PDI, DMAs & Controller, PCI, CMI (Memory), GPI, BRG,
RCE CORE, RISC CPU, LSM, ACM, TSA, GPO, PLL (Clocks), SCM, ICB (Interrupts),
TBM (Timers).]



[Sheets 2/16 and 3/16. FIG. 3: labels CONTROL SLICE, LSM ARRAY. FIG. 4 and
FIG. 5: labels GPIO, BUS I/F, DATAPATH SLICE, CONTROL SLICE, GIGABIT MAC,
SDRAM I/F.]



[Sheet 4/16. FIG. 6: labels Global control/Clock Distribution, BLU, CM, Ctrl,
T, M, LSM.]



[Sheet 5/16. FIG. 7: labels Global control/Clock Distribution, BLU, CM, System
Memory.]



[FIG. 8: label System Memory.]



[Sheet 7/16. FIG. 9: label DPU with signals a0, a1, b0, b1, c0, c1, c2.
FIG. 10: add elements with operands a to h and intermediate results ab, cd,
ef, gh.]



[Sheet 8/16. FIG. 11: CONTROL RECONFIGURABLE LOGIC (1101), CONTROL
CONFIGURATION STORAGE (1107), DATAPATH RECONFIGURABLE LOGIC (1103), DATAPATH
CONFIGURATION STORAGE (1109), MEMORY (RECONFIGURABLE) (1105), MEMORY
CONFIGURATION STORAGE (1111).]



[Sheet 9/16. FIG. 12a: labels x-y memory planes, DP function plane, DP wire
plane, Configuration Vertical stack 2 port memory, Bus interface Single port
Memory. FIG. 12b: labels Configuration of Datapath Function, Configuration of
Datapath Wiring, Read from and write to any cell, Configuration Function plane,
Configuration Wire plane, Address to all bits in a plane 1R/1W port, External
access.]



[Sheet 10/16. FIG. 12c: labels Configuration of Datapath Function,
Configuration of Datapath Wiring. FIG. 12d: labels Configuration CTL,
Configuration DP, Configuration MEM.]



[Sheet 11/16. FIG. 12e: labels Configuration CTL, Configuration DP,
Configuration MEM.]



[Sheet 12/16. FIG. 13: signals Fabric_Define_data, Read_Config_State,
Lock_and_Config_Bar, Read_Plane_1 to Read_Plane_4, Write_Plane_1 to
Write_Plane_4, Data_Write, Config_Read_Data, Config_Load_Data,
Data_Recirc_Read, Swap_Read_Plane; elements LTCH, Storage-Cell; reference
numerals 1301 to 1305, 1307, 1309, 1311, 1313.]



[Sheet 13/16: figure caption not recoverable; surviving labels LTCH, D, EN.]



[Sheet 14/16. FIG. 15a to FIG. 15f: plots with both axes marked 0, 4, 8, 12,
16. FIG. 18a: Function Map table with rows A, B, C, etc. and fields planes and
module address. FIG. 18b: block configuration with fields routing plane and
function plane.]



[Sheet 15/16. FIG. 16: labels not recoverable. FIG. 17a: Function Map 1 with
Function A, Function B, B'. FIG. 17b: Function Map 2 with Function C.]



[Sheet 16/16: figure content not recoverable.]



INTERNATIONAL SEARCH REPORT

International application No.: PCT/US98/13565

A. CLASSIFICATION OF SUBJECT MATTER

IPC(6): G06F 13/00
US CL: 395/800.15, 800.37, 284
According to International Patent Classification (IPC) or to both national
classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by
classification symbols)
U.S.: 395/800.15, 800.37, 284

Documentation searched other than minimum documentation to the extent that such
documents are included in the fields searched

Electronic data base consulted during the international search (name of data
base and, where practicable, search terms used)
APS
search terms: reconfigurable, fine grained, coarse grained, processor


C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category*   Citation of document, with indication, where appropriate,   Relevant to
            of the relevant passages                                     claim No.

Y           US 5,617,577 A (BARKER ET AL.) 01 April 1997, FIGS. l-e,     1-16
            col. 11, line 20 - col. 15, line 13.

Y           US 5,613,146 A (GOVE ET AL.) 16 March 1997, FIGS. 57-6A,     1-16
            col. 6, line 11 - col. 12, line 63.

Y,P         US 5,680,634 A (ESTES ET AL.) 21 October 1997, col. 6,       1, 13
            lines 31-67.

Y           US 5,500,609 A (KEAN) 19 March 1996, col. 39, line 35 -      13-16
            col. 40, line 31.


* Special categories of cited documents:

"A"  document defining the general state of the art which is not considered
     to be of particular relevance
"E"  earlier document published on or after the international filing date
"L"  document which may throw doubts on priority claim(s) or which is cited
     to establish the publication date of another citation or other special
     reason (as specified)
"O"  document referring to an oral disclosure, use, exhibition or other means
"P"  document published prior to the international filing date but later than
     the priority date claimed
"T"  later document published after the international filing date or priority
     date and not in conflict with the application but cited to understand the
     principle or theory underlying the invention
"X"  document of particular relevance; the claimed invention cannot be
     considered novel or cannot be considered to involve an inventive step
     when the document is taken alone
"Y"  document of particular relevance; the claimed invention cannot be
     considered to involve an inventive step when the document is combined
     with one or more other such documents, such combination being obvious to
     a person skilled in the art
"&"  document member of the same patent family


Date of the actual completion of the international search: 16 SEPTEMBER 1998

Date of mailing of the international search report: 19 OCT 1998

Name and mailing address of the ISA/US:
Commissioner of Patents and Trademarks
Box PCT
Washington, D.C. 20231
Facsimile No. (703) 305-3230

Authorized officer: ERIC COLEMAN
Telephone No. (703) 305-#674



